Etiology and risk factors for diarrheal disease amongst rural and peri-urban populations in Cambodia, 2012–2018

Diarrheal diseases are a leading cause of mortality and morbidity, disproportionately affecting persons residing in low- and middle-income countries. Accessing high-resolution surveillance data to understand community-level etiology and risk remains challenging, particularly in remote and resource-limited populations. A multi-year prospective cohort study was conducted in two rural and two peri-urban villages in Cambodia from 2012 to 2018 to describe the epidemiology and etiology of acute diarrheal diseases within the population. Suspected diarrheal episodes among participants were self-reported or detected via routine weekly household visits. Fresh stool and fecal swabs were tested, and acute-illness and follow-up participant questionnaires were collected. Of 5027 enrolled participants, 1450 (28.8%) reported at least one diarrheal incident. A total of 4266 individual diarrhea case events were recorded. The diarrhea incidence rate was calculated as 281.5 persons per 1000 population per year, with an event rate of 664.3 individual diarrhea events per 1000 population per year. Pathogenic Escherichia coli, Aeromonas spp., and Plesiomonas shigelloides were the most prevalent bacterial infections identified. Hookworm and Strongyloides stercoralis were the predominant helminth species, while Blastocystis hominis and Giardia lamblia were the predominant protozoan species found. Norovirus genotype 2 was the predominant virus identified. Mixed infections of two or more pathogens were detected in 36.2% of positive cases. Risk analyses identified that unemployed status increased diarrhea risk by 63% (HR = 1.63 [95% CI 1.46, 1.83]). Individuals without access to protected water sources or sanitation facilities were at 59% (HR = 1.59 [95% CI 1.49, 1.69]) and 19% (HR = 1.19 [95% CI 1.12, 1.28]) greater risk of contracting diarrhea, respectively.
Patient-level surveillance data captured in this long-term study have generated a unique spatiotemporal profile of diarrheal disease in Cambodia. Understanding etiologies, together with associated epidemiological and community-level risk, provides valuable public health insight to support effective planning and delivery of appropriate local population-targeted interventions.

Introduction
Diarrheal diseases are a leading cause of global mortality and morbidity [1,2], with a disproportionate impact suffered by persons residing in low- and middle-income countries (LMIC), where access to health care, safe drinking water, and sanitation is limited [3][4][5][6]. Diarrheal diseases remain among the top five causes of mortality in Southeast Asia, accounting for 6% of all deaths [1], despite a decline in the overall disease burden within the last decade. Determining the etiology of diarrheal diseases is complex, given the large number of potential causative entero-pathogens, the high occurrence of mixed infections, challenges associated with various detection methods, and the sensitivity of available diagnostics [7,8]. Environmental, economic, and behavioral risk factors associated with diarrheal diseases also vary, both geographically and within populations [9][10][11]. Characterizing the epidemiology and etiology of diarrheal diseases will help health professionals and policy makers review existing strategies, guide future health planning, and target appropriate interventions [7,10–13].
Despite the widespread global occurrence of diarrheal diseases, the availability of detailed data to evaluate etiology and risk is often limited, particularly at the community level [5,13,14]. The lack of high-resolution epidemiologic data is most acute in areas of highest burden, such as remote and resource-poor settings and populations with limited access to healthcare, where risk factors may also differ from those in other settings. Patient-level surveillance datasets provide an optimal source to characterize disease incidence, etiology, and associated risk [14].
In Cambodia, the burden of diarrheal disease is significant: it remains a leading cause of mortality in children and adolescents and is a top-ten contributor to disability-adjusted life years (DALYs) for all ages [1]. To better characterize diarrheal threats in Cambodia, a multi-year prospective cohort study was conducted in four villages. The aims of the study were to monitor and describe the burden of diarrheal diseases among rural and peri-urban populations, and to describe the epidemiology and etiology of acute diarrheal infections among village cohorts in Kampong Cham and Tbong Khmum provinces in Cambodia.

Study site
The study was conducted between August 2012 and March 2018 in two peri-urban Cambodian villages in Kampong Cham province (Trapeang Chhuk and Roveang) and two rural villages in Tbong Khmum province (Chong Angkrang and La Ork) (Fig 1). Peri-urban sites in this study were characterized as township-centered settlements with higher household densities and more suburban development, whereas the rural study site villages were characterized by lower household densities and higher proportions of agricultural land. Kampong Cham and Tbong Khmum provinces were selected due to their agricultural-commercial significance within Cambodia and the region; they are located in the Mekong plains region northeast of the capital Phnom Penh and adjacent to the border with Vietnam. The four representative study site villages within these provinces were selected by the study team due to their geographic diversity, local community interest, and high rates of participation in previous studies.

Study design
Acute disease surveillance program. Participants were recruited for the study through an acute-disease surveillance program established in 2012 as part of a long-term collaboration between the Royal Cambodian Ministry of Health (MoH), the Cambodian National Institute of Public Health (NIPH), and the United States Naval Medical Research Unit-2 (NAMRU-2). Key objectives of this surveillance program were to support MoH-led public health measures through the implementation of high-confidence diagnostic testing protocols to measure and monitor rates of febrile vector-borne illness (FVBI), acute respiratory, and diarrheal diseases within the target populations, and to identify the responsible pathogens and associated risk factors.
Participant enrollment and eligibility. Individuals were enrolled into the study during an initial house-to-house census. The study design allowed for the ongoing enrollment of additional participants throughout the observation period as part of annual household information collection and census activities conducted in each study village. Individuals were eligible if they were six months of age or older, had resided in their respective village for six months or more prior to enrollment, physically lived in the village for more than 80% of the time throughout the duration of the study, and provided voluntary consent to participate in and comply with all aspects of the study. A meal supplementation, at an approximate value of US$2-3, was offered as an incentive to participants.
Data collection and screening. Following enrollment, demographics of each household member, as well as relevant household characteristics pertaining to water, sanitation, hygiene, and the presence of domesticated animals, both pets and livestock, were collected. One adult "primary contact" was identified for each household, issued a thermometer, and instructed on how to identify suspected fever and record the body temperature of study participants within their household, including themselves, and to contact study staff if a member of the household had diarrhea or other concerning symptoms.
Suspected diarrheal events were reported by the household primary contact to one of 17 local village study field staff who had been trained in interviewing, data gathering, and specimen collection. Routine weekly visits to each household were also conducted by study field staff, where participants were interviewed and asked to self-report any recent occurrences or symptoms of fever or diarrhea. Field staff also tested participants for fever using an axillary thermometer and updated a weekly follow-up log, including household demographics and participant status to maintain an accurate register of study participants upon each study visit. Individuals confirmed to have had a fever or acute diarrhea, either identified and reported by a primary contact to study field staff, or identified during routine weekly visits, were classified as a case, and visited in their home by a study physician and laboratory assistant for an acute illness investigation. As part of these in-home medical assessments, an acute illness questionnaire was completed and biological specimens including fresh stool and fecal swabs were collected for laboratory testing. Patients who were unable or unwilling to provide a biological specimen during the in-home medical assessment were issued a stool collection kit and asked to provide a specimen within 48 hours. Specific acute illness follow-up visits (approximately 30 and 60 days later) were also conducted. Data collected during these visits included information regarding illness outcomes, additional medical care that may have been sought, such as hospitalization, and any treatments that may have been received.

Definition of acute diarrhea and fever
For this study, an acute diarrheal case was defined as three or more loose or liquid stools within a 24-hour period, or two loose or liquid stools accompanied by at least two additional gastrointestinal symptoms (such as nausea and/or abdominal cramping) within a 24-hour period, and not lasting more than 14 days. Dysentery was defined as the presence of blood in diarrhea within the previous 10 days. Fever was defined as an axillary temperature ≥37.5˚C, or tympanic temperature ≥38˚C.
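The case definitions above can be expressed as a small set of rules. The following is a minimal sketch only, not the study's screening instrument: the `Episode` fields and function names are hypothetical, and the fever thresholds are assumed to be inclusive (at or above the stated temperature).

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical record of one reported illness episode."""
    loose_stools_24h: int          # loose or liquid stools within a 24-hour period
    extra_gi_symptoms: int         # additional GI symptoms (e.g. nausea, cramping)
    duration_days: float           # how long the diarrhea has lasted
    blood_in_stool_last_10d: bool = False


def is_acute_diarrhea(ep: Episode) -> bool:
    """Apply the acute diarrhea definition from the text."""
    if ep.duration_days > 14:      # episodes longer than 14 days are not acute
        return False
    if ep.loose_stools_24h >= 3:   # three or more loose stools in 24 hours
        return True
    # two loose stools plus at least two additional GI symptoms
    return ep.loose_stools_24h == 2 and ep.extra_gi_symptoms >= 2


def is_dysentery(ep: Episode) -> bool:
    """Blood in diarrhea within the previous 10 days."""
    return ep.blood_in_stool_last_10d


def has_fever(temp_c: float, site: str) -> bool:
    """Fever thresholds by measurement site (assumed inclusive)."""
    thresholds = {"axillary": 37.5, "tympanic": 38.0}
    return temp_c >= thresholds[site]
```

For example, `is_acute_diarrhea(Episode(2, 2, 3))` is true under the second criterion, whereas two loose stools with a single additional symptom would not qualify.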

Specimen collection, transportation, and analysis
Fresh stool and fecal swabs were collected from all study participants. Within one hour of collection, a sample of fresh stool was added to an ova and parasite (O&P) preparation tube for microscopic analysis and molecular characterization. Specimens for microscopic analysis were aliquoted into 10% formalin and polyvinyl alcohol (PVA)-containing medium and transported at room temperature for the identification of parasites by microscopy. The remainder of the fresh stool was aliquoted into cryovials, transported at 4˚C to the NAMRU-2 laboratory, and frozen at -70˚C within two hours for future testing by various molecular techniques adapted from previous reports [16][17][18][19][20].

Data management and statistical analysis
The number of diarrhea cases, proportions, and incidence rates were used to describe the epidemiological and etiological characteristics of the study population. Categorical data were analyzed using chi-square tests (expected cell frequency > 5) to measure the association between risk factors and disease, and the log-rank test for the survival model was used to calculate incidence rates (95% confidence interval), with statistical significance set at p < 0.05.
Multi-event survival regression modeling (Cox survival analysis) was used to conduct multivariate risk analysis and calculate the hazard ratio (HR, 95% confidence interval) and covariate coefficient for the outcome of a diarrhea event [21]. Given the possibility of a participant experiencing diarrhea more than once during the study period, diarrhea cases were considered as recurring events, with an individual diarrhea event treated as a time-varying covariate and enrollments as multiple failure-time (multiple-event survival) data [22]. To support data modeling, a participant who did not experience a diarrhea event throughout the study was assigned a "No" value as a censored status for diarrhea. If a participant had multiple visits without reporting diarrhea, all visit observations except the last visit were dropped from the database. A participant who experienced a diarrhea event was assigned a "Yes" value for the status of diarrhea. If a participant experienced multiple diarrhea events, the number of observations for that participant was equal to the number of diarrhea events.
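The observation rules described above (one censored "No" row for participants without diarrhea, one "Yes" row per diarrhea event otherwise) can be sketched as follows. This is an illustration under stated assumptions rather than the study's Stata data preparation: the function name, the row fields, and the use of days since enrollment as the time scale are all hypothetical.

```python
def build_observations(participant_id, visits):
    """Collapse a participant's visit log into modeling rows.

    visits: list of (day_since_enrollment, had_diarrhea) tuples,
            sorted by day (hypothetical input format).
    """
    if not visits:
        raise ValueError("participant has no recorded visits")

    # One row per diarrhea event, each with status "Yes".
    event_rows = [
        {"id": participant_id, "day": day, "diarrhea": "Yes"}
        for day, had_diarrhea in visits
        if had_diarrhea
    ]
    if event_rows:
        return event_rows

    # No events: keep only the last visit, flagged "No" (censored).
    last_day = visits[-1][0]
    return [{"id": participant_id, "day": last_day, "diarrhea": "No"}]
```

A participant with three diarrhea events therefore contributes three rows, while a participant with fifty uneventful weekly visits contributes a single censored row at the final visit.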
Data were analyzed using Stata software (StataCorp, College Station, TX, USA). Study participants who died or withdrew were right censored upon confirmation.
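Hazard ratios from the regression model are reported in this paper alongside percentage changes in risk. The conversion is simple arithmetic, sketched here with a hypothetical helper; the HR values in the usage note are those reported in this paper, except 0.68, which is only the value implied by the reported 32% reduction for the wet season.

```python
def hr_to_percent_change(hazard_ratio: float) -> float:
    """Percentage change in hazard relative to the reference group.

    Positive values indicate increased hazard; negative values
    indicate reduced hazard.
    """
    return round((hazard_ratio - 1.0) * 100.0, 1)
```

For instance, `hr_to_percent_change(1.63)` gives 63.0 (unemployed status), `hr_to_percent_change(1.59)` gives 59.0 (no protected water source), `hr_to_percent_change(1.19)` gives 19.0 (no sanitation access), and an HR of 0.68 corresponds to -32.0.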
Risk factor variables. To support this data analysis, participant access to water sources was categorized as protected or unprotected. Protected water sources included rainwater and government-regulated water supply wells or tap water. Unprotected water sources included non-regulated sources such as rivers, lakes, or open wells. Sanitation was categorized as access to a toilet facility compared to no access to a toilet facility. Access to a toilet included access to a modern flush toilet, septic, or pit toilet. Treatment of water before drinking was also categorized as treated or untreated, with treated including boiling, filtering, or chemical purification. Untreated water was classified as drinking water directly from its source without intervention. Occupation was categorized as skilled, unskilled, or unemployed, with skilled occupations including business and professional workers, government employees, and students, and unskilled occupations including farming, factory, construction, and restaurant work. Seasonality was categorized as dry and wet seasons, with the wet season typically taking place between May and September, and the dry season from October through April.
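The categorization rules above amount to a set of lookups. The sketch below is one possible encoding: the category labels follow the text, but the raw input strings (such as `"open_well"` or `"factory"`) and the function names are hypothetical and would depend on how the questionnaire responses were coded.

```python
# Raw response codes are assumptions made for illustration only.
PROTECTED_WATER = {"rainwater", "regulated_well", "tap"}
UNPROTECTED_WATER = {"river", "lake", "open_well"}
TOILET_TYPES = {"flush", "septic", "pit"}
TREATMENT_METHODS = {"boiling", "filtering", "chemical"}
SKILLED = {"business", "professional", "government", "student"}
UNSKILLED = {"farming", "factory", "construction", "restaurant"}


def water_source(source: str) -> str:
    if source in PROTECTED_WATER:
        return "protected"
    if source in UNPROTECTED_WATER:
        return "unprotected"
    raise ValueError(f"unknown water source: {source!r}")


def sanitation(toilet) -> str:
    """toilet is a toilet type, or None when no facility is available."""
    return "access" if toilet in TOILET_TYPES else "no access"


def water_treatment(method) -> str:
    """method is a treatment type, or None when water is drunk untreated."""
    return "treated" if method in TREATMENT_METHODS else "untreated"


def occupation(job: str) -> str:
    if job in SKILLED:
        return "skilled"
    if job in UNSKILLED:
        return "unskilled"
    if job == "unemployed":
        return "unemployed"
    raise ValueError(f"unknown occupation: {job!r}")


def season(month: int) -> str:
    """Wet season: May through September; dry season: October through April."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "wet" if 5 <= month <= 9 else "dry"
```

Raising an error on unrecognized water sources and occupations, rather than silently assigning a default, is a design choice made here so that miscoded responses surface early.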

Ethical clearance and consent
The study protocol was approved by the Kingdom of Cambodia's National Ethics Committee for Health Research and the Naval Medical Research Center's Institutional Review Board, project number (NAMRU2.2012.0001), in compliance with all applicable federal regulations governing the protection of human subjects. Eligible participants ≥ 18 years of age provided written informed consent, while the parent or legal guardian of non-adult participants (< 18 years of age) provided permission and the dependent provided assent for participation.

Inclusivity in global research
Additional information regarding the ethical, cultural, and scientific considerations specific to inclusivity in global research is included in the Supporting Information (S1 Questionnaire).

Results
A total of 5027 participants were included in the study, from a potential 5500 individuals surveyed across 1139 households. Among those surveyed, 455 (8.27%) individuals did not meet the inclusion criteria and 18 (0.33%) refused to participate; these individuals were excluded from the study. Throughout the course of the five and one-half-year study period, 231 (4.60%) participants were not followed through the conclusion of the study, either due to death (100; 1.99%) or moving away from their respective study village (131; 2.61%). A total of 4266 diarrhea cases were reported during the study. Fig 2 provides a flow-chart illustration of participation, rolling recruitment, and diarrhea cases captured throughout the study period. Table 1 provides a breakdown of participant demographics. Of the 5027 total study participants, 2490 (49.5%) reported an FVBI, acute respiratory, or diarrheal disease illness as part of the acute surveillance program and were classified as patients. A total of 1450 (28.8%) reported at least one diarrhea case during the study. Over the course of the study period, the diarrhea incidence rate was calculated at 281.5 persons per 1000 population per year, with an event rate of 664.3 individual diarrhea events per 1000 population per year.
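The enrollment accounting and the proportions in this paragraph can be re-derived from the stated counts. The sketch below is a plain arithmetic check with hypothetical variable names; it does not attempt to reproduce the incidence and event rates, because the person-time denominators used for those rates are not given in this section.

```python
def pct(part: int, whole: int, digits: int) -> float:
    """Percentage of `whole` represented by `part`, rounded."""
    return round(100 * part / whole, digits)


surveyed = 5500
ineligible = 455
refused = 18
enrolled = surveyed - ineligible - refused   # participants included in the study

died = 100
moved_away = 131
lost_to_follow_up = died + moved_away

patients = 2490            # any FVBI, acute respiratory, or diarrheal illness
diarrhea_cases = 1450      # participants with at least one diarrhea case
diarrhea_events = 4266     # individual diarrhea case events

summary = {
    "enrolled": enrolled,
    "ineligible_pct_of_surveyed": pct(ineligible, surveyed, 2),
    "refused_pct_of_surveyed": pct(refused, surveyed, 2),
    "lost_pct_of_enrolled": pct(lost_to_follow_up, enrolled, 2),
    "patients_pct_of_enrolled": pct(patients, enrolled, 1),
    "diarrhea_pct_of_enrolled": pct(diarrhea_cases, enrolled, 1),
    "events_per_participant_with_diarrhea": round(diarrhea_events / diarrhea_cases, 1),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that the exclusion percentages are consistent with a denominator of the 5500 individuals surveyed, while the follow-up and illness percentages use the 5027 enrolled participants.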
Study participants with diarrhea had an average of 2.9 individual diarrhea events during the study period. The average length of time between diarrhea events for a study patient was 1.4 years (p<0.001). Seasonality did not appear to influence the occurrence of multiple diarrhea case events, with 54.8% of study participants reporting diarrhea events in both the dry and wet seasons rather than in only one season (p<0.001).
There were 4266 individual diarrhea case events reported in the study.

Frequency of single and mixed infections
Of the 4266 diarrhea cases recorded, specimens were collected for analysis in 3841 cases, consisting of 3839 individual rectal swabs and 3764 individual fresh stools. Fresh stool samples or rectal swabs could not be collected during in-home medical assessments or within 48 hours in 425 cases because patients were unable to provide or return a sample within the requisite time period. Of the 3841 diarrhea cases analyzed, 2726 (71.0%) were positive for at least one stool pathogen (virus, parasite, or bacteria). Positivity rates were statistically higher among females (72.8%; p = 0.002), children (76.1%; p<0.001), and patients without access to sanitation facilities (72.7%; p = 0.004). Mixed infections, defined as infections where two or more pathogens were detected, occurred in 36.2% of all positive cases. The most common type of mixed infection was parasite and bacterial, accounting for 69.8% of all mixed infections. Statistically higher rates of mixed infections were found in peri-urban areas (36.9%; p = 0.037), among children (41.5%; p<0.001), and during the dry season (40.8%; p<0.001). Table 4 provides a summary of stool specimen sample characteristics.
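The single versus mixed infection definition used here can be stated as a short rule. The following sketch assumes a hypothetical input format, a list of (pathogen name, pathogen class) pairs per case, and is not the study's laboratory reporting logic.

```python
def classify_infection(detections):
    """Classify a case by the number of distinct pathogens detected.

    detections: list of (pathogen_name, pathogen_class) pairs,
                e.g. ("Giardia lamblia", "parasite").
    """
    distinct = {name for name, _ in detections}
    if not distinct:
        return "negative"
    # Two or more pathogens detected counts as a mixed infection.
    return "single" if len(distinct) == 1 else "mixed"


def mixed_infection_type(detections):
    """Describe a case by the pathogen classes involved."""
    classes = sorted({pathogen_class for _, pathogen_class in detections})
    return "+".join(classes)


case = [
    ("pathogenic Escherichia coli", "bacteria"),
    ("Giardia lamblia", "parasite"),
]
print(classify_infection(case), mixed_infection_type(case))
```

Under this rule, two bacterial species detected in the same case also count as a mixed infection, even though only one pathogen class is involved.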

Bacterial gastrointestinal infections
Of the 3839 stool samples tested, 1460 (38.0%) were positive for enteric bacteria. Positivity percentages were statistically higher among patients living in rural settings (42.5%; p = 0.022), females (39.5%; p = 0.016), and children (42.4%; p<0.001). Positivity percentages among samples collected during the dry season (43.3%; p<0.001) were significantly higher than among those collected during the wet season (33.4%). Of the 1460 positive samples, 374 (25.6%) contained more than one GI bacterial species.

Parasitic gastrointestinal infections
Of the 3764 stool samples tested for ova and parasites, 1936 (51.4%) were positive. Positivity percentages were statistically higher among participants living in peri-urban settings (53.2%; p<0.001), females (53.2%; p = 0.006), children (54.3%; p = 0.004), and patients without access …
A summary of all pathogens identified in the study, their detection frequency, and association with mixed infections is provided as supporting information (S1 Table).
Two distinct peaks in incidence and event rates were observed during the study (Fig 3), with the first peak towards the end of 2012 and the second occurring in the third quarter of 2015. Following the second peak in 2015, both incidence and event rates declined sharply and remained at lower levels for the remainder of the study period relative to rates recorded from 2012 to 2015. … (31), Giardia lamblia (28), Campylobacter spp. (15), and pathogenic Escherichia coli (11) were the dominant pathogens detected during the second peak in 2015.

Discussion
The results from this study present an evaluation of diarrheal disease as part of a five and one-half-year active surveillance study among a targeted population living in both peri-urban and rural settings in Cambodia. Key methodologic aspects of this long-term study, including the collection of detailed individual participant and household-level information, the adoption of both passive and active case-finding surveillance approaches, and the incorporation of laboratory-level diagnostics, allowed for the detection of diarrheal pathogens and the capture of detailed epidemiological data that could have been missed through typical passive case detection measures alone. Similarly, the longitudinal component of this study allowed for the assessment of temporal changes in disease incidence and enabled patient-level tracking and recording of multiple infection events, providing insight into the trends and burden of diarrheal disease within the community.
With almost one in three study participants experiencing diarrhea, an event rate of 664.3 individual diarrhea events per 1000 population per year, and an average of 2.9 diarrhea case events among individuals reporting diarrhea, the impact of these diseases on the population is clear. Data presented in Table 2 indicate that diarrheal infections impacted adults and children alike, with more than half of patients in each age category (except the 5–15-year age group) recording at least one diarrhea case in the study. Given the burden of diarrheal diseases, including the potential impacts these illnesses have on growth and development during early childhood, and in terms of DALYs [2,23–27], understanding the likelihood of exposure events within a population has important implications for health program planning.
In line with previous research [28][29][30], results from this study identified a statistical association between a participant reporting diarrhea and the use of unprotected water sources. Several related risk factors consistent with previous studies are also highlighted, including poor water quality and being unemployed [31][32][33], as well as findings linked to socioeconomic factors including living conditions, access to amenities and services, and education. Living in a peri-urban setting was also statistically associated with patients reporting diarrhea compared to living in a rural environment in this study. As standards among urban living conditions vary considerably based on a multitude of factors such as socioeconomics, levels of governance, and culture, specific data regarding peri-urban versus rural settings and diarrhea are somewhat limited. However, previous research on living conditions, crowding, and sanitation has demonstrated an increased association with diarrhea [34][35][36], and these may be factors to consider for targeted public health interventions. Of note, while no statistical association between diarrhea and access to sanitation facilities was found in the bivariate analyses in this study, lack of access was associated with a 19% higher hazard of diarrhea in the multi-event survival regression modelling. Previous research into sanitation and diarrhea indicated that additional factors, such as maintaining cleanliness of sanitation facilities and promoting handwashing and good hygiene practice, may be required to support a reduction in diarrhea beyond simply improving access [31,35,36], suggesting that further detailed investigation into these aspects of sanitation would be beneficial for health programs. Seasonality also impacted the risk of acquiring diarrhea. Despite a higher number of diarrhea cases reported during wet season periods over the course of the study period, the calculated incidence rate of dry season infections was higher.
Similarly, results of the multivariate analysis identified a 32% reduction in the hazard of acquiring diarrhea during the wet season. Possible explanations include the concentration-dilution hypothesis, in which rainfall events occurring after dry periods may flush pathogens into surface water, whereas rainfall following wet periods may dilute pathogen concentrations [37]. In populations where large proportions of individuals lack access to protected water sources, such as in this study, these findings provide valuable insight for identifying at-risk populations and potential high-risk transmission periods. Further research is required to analyze the potential causative factors contributing to increased prevalence of diarrhea.
Several pathotypes of pathogenic E. coli (ETEC, EPEC, and EAEC) are highly infectious to humans, and since such a high percentage of acute infections in our study were found to contain these bacteria, it is reasonable to infer that they were likely the primary causative agent even among the many mixed infections, including those with co-infection or colonization with other pathogens such as Blastocystis hominis. Pathogenic E. coli in this study was found to be more common in children during the dry season in peri-urban areas closer to animals and areas of food production, which is plausible given that pathogenic E. coli is a food-borne bacterium known to cause serious GI illness. Poorly cooked or raw foodstuffs can contain ETEC, EPEC, and/or EAEC, along with other bacteria and/or parasites; therefore, better food safety initiatives in peri-urban areas could lead to fewer incidents of GI illness, especially in children.
Shigella infections accounted for 43 (26%) of all bloody diarrhea samples collected in this study and are therefore important to consider when diagnosing GI illness and weighing antibiotic treatment in a patient presenting with hematemesis or bloody dysentery. The water-borne parasite Giardia spp., which is known to be found in contaminated food, soil, and water tainted with animal feces, was, unsurprisingly, found most frequently in children in our study. Parasitic disease prevention programs, including comprehensive education for young people regarding personal hygiene, properly cooking food, and sterilizing water, are critical to reduce acute GI illness in non-urban areas.
As illustrated in Fig 3, a key finding from the study was the observed decline in overall diarrheal incidence over the course of the study period, with a significant reduction recorded between the third and fourth quarter of 2015. This reduction and the subsequent low incidence rate reported through to the end of the study period could potentially suggest a beneficial impact resulting from a targeted intervention; however, further investigation would be required to validate this. Of note, yearly meetings between the study site Provincial Health Departments and the Cambodia Ministry of Health commenced in 2014, early in this surveillance study, to review data generated from the study and discuss diarrhea prevention programs. An example of a targeted response intervention derived from these yearly meetings was the deployment of health personnel to Kampong Cham province in 2016 to investigate diarrheal infection and subsequently recommend interventions, including the targeted chlorination of water sources and the implementation of healthy food preparation and safe storage education and awareness programs [38]. Further research is required to confirm the impacts of such targeted prevention and control interventions on diarrhea incidence within Cambodia, as this study was not designed to determine the effectiveness of an intervention, but instead to observe and describe incidence.
Limitations associated with this study include potential bias associated with the self-reporting component of the study methodology. While efforts were made by study personnel to adequately guide household primary contacts on illness identification and reporting procedures, and to confirm self-reported illnesses during the routine weekly visits, the potential for misidentification of cases should be noted. Given the length of the study, several factors were likely to have changed over time in ways that could not be accounted for in the regression modelling. The regression model analysis used the last recorded state of these factors, and as such does not capture any association between changes in these factors and outcomes over time.
Aligning the etiologies of diarrheal disease with an understanding of the epidemiological and demographic risk profiles and transmission characteristics within a population provides valuable public health insight. Given the global significance of diarrheal disease, particularly in low- and middle-income countries where health systems and disease surveillance systems are often strained, studies that support the identification and characterization of etiologies and associated risk factors leading to infection are essential. Data generated from this study provide an evidence base for health programs to strengthen the planning and delivery of appropriate, targeted, community- and population-specific interventions.
Supporting information
S1 Questionnaire. Inclusivity in global research questionnaire. (DOCX)
S1 Table. Summary of pathogens detected, their frequency, and association with mixed infections.